@ReinierKoops
Contributor

@ReinierKoops ReinierKoops commented Mar 14, 2025

This PR contains:

  • Type checking for all methods: add Literal checks, re-evaluate some existing checks, and give similar methods similar expected inputs and outputs (for example, plot should return fig).
  • Update the figure produced by running SHAPRFECV (the plotted variable is discrete, not continuous).
  • Added some missing default parameters to functions that were until now passed via kwargs (e.g. groups for fit_compute).
  • Added more comments and docstrings to better explain what is going on. (TODO: re-evaluate all docstrings before merging the PR.)
  • Updated the SHAP calculation function. It now better supports multiclass classification, with aggregation, weighting and class selection options.
  • Added a tqdm progress bar and better management of the extensive logging from some other libraries, with tighter integration (also fixes "Log in docs is too long", #274).
  • Made the unit tests more robust: removed some, updated others, and added new ones specific to catboost, lgbm, xgboost, regression, classification, random forest, cv, multiclass, and others. (TODO: merge similar fixtures; make sure all fixtures are in conftest.py; merge tests that cover overlapping functionality; reduce the number of tests; reassess whether added assertions can be removed or merged; re-introduce tests from the old Probatus that cover specific bugs and edge cases that make sense to test.)
  • Update to NumPy 2+ and the latest SHAP (plus other package updates).
  • Backwards compatibility for NumPy 1.*.
  • Update docs and notebooks: add multiclass and regression examples (also fixes "Log in docs is too long", #274); cover other functionality such as the different multiclass options; give an overview of the new plots; show examples of legacy and newer imports; explain the intricacies of the sample size and its effect on speed and the additivity check; and add a notebook/doc on the changes in this new version.
  • Use newer plots from SHAP (replace summary_plot with the bar plot) and add consistent, more modern styling to plots.
  • Integrate the use of Pipelines.
  • Update the README to include all the latest changes.
  • Remove deprecations and old dead code, and try to fix the warnings that can be fixed.
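
To illustrate the aggregation, weighting and class selection options for multiclass SHAP values listed above, here is a minimal sketch. It is not the actual Probatus implementation: the function name `aggregate_multiclass_shap`, its parameters, the `Literal` option names, and the assumed input shape `(n_samples, n_features, n_classes)` are all hypothetical. It only needs NumPy.

```python
from typing import Literal, Optional, Sequence

import numpy as np

# Hypothetical option names; a Literal lets a type checker reject typos.
Aggregation = Literal["mean", "weighted", "select"]


def aggregate_multiclass_shap(
    shap_values: np.ndarray,
    method: Aggregation = "mean",
    class_weights: Optional[Sequence[float]] = None,
    class_index: Optional[int] = None,
) -> np.ndarray:
    """Reduce (n_samples, n_features, n_classes) values to (n_samples, n_features)."""
    if shap_values.ndim != 3:
        raise ValueError("expected shape (n_samples, n_features, n_classes)")

    n_classes = shap_values.shape[2]
    abs_values = np.abs(shap_values)

    if method == "mean":
        # Unweighted mean of absolute contributions across classes.
        return abs_values.mean(axis=2)

    if method == "weighted":
        if class_weights is None:
            raise ValueError("class_weights is required when method='weighted'")
        weights = np.asarray(class_weights, dtype=float)
        if weights.shape != (n_classes,):
            raise ValueError("class_weights must have one entry per class")
        # Weighted mean of absolute contributions across classes.
        return np.average(abs_values, axis=2, weights=weights)

    if method == "select":
        if class_index is None:
            raise ValueError("class_index is required when method='select'")
        # Signed values for a single class; no aggregation is applied.
        return shap_values[:, :, class_index]

    raise ValueError(f"unknown method: {method!r}")
```

For example, `aggregate_multiclass_shap(values, "weighted", class_weights=[3, 1])` would weight the first class three times as heavily as the second. Aggregating absolute values is one possible design choice among several; it avoids positive and negative contributions from different classes cancelling each other out.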

@PaulZhutovsky

@ReinierKoops ReinierKoops changed the title Simplify & Improve Probatus code Added Type-hinting to methods Mar 15, 2025
@ReinierKoops ReinierKoops changed the title Added Type-hinting to methods Added Type-hinting to methods + Update to Numpy v2 Mar 15, 2025
@ReinierKoops
Contributor Author

CatBoost does not work at the moment. This will be fixed in the newer version of Probatus; I expect CatBoost to add NumPy v2 support any day now.

@detrin
Contributor

detrin commented Mar 17, 2025

If you have spare time, I would definitely give it a try. I normally use uv, but poetry is more suitable for debugging dependencies.

@ReinierKoops
Contributor Author

ReinierKoops commented Apr 4, 2025

Still waiting for the CatBoost release; it seems it could land at any moment. In the meantime I am doing some refactoring: making wrapper classes for estimators, datasets and SHAP to simplify the implementation and disentangle the logic.
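
As a rough sketch of the wrapper idea (all names here are hypothetical and not the actual Probatus code), an estimator wrapper could put one small interface in front of different model libraries, so the rest of the code no longer has to branch on the model type:

```python
from dataclasses import dataclass
from typing import Any, Literal

TaskType = Literal["classification", "regression"]


@dataclass
class EstimatorWrapper:
    """Hypothetical adapter that gives different estimators one interface."""

    estimator: Any

    @property
    def name(self) -> str:
        # Class name of the wrapped model, useful for logs and plot titles.
        return type(self.estimator).__name__

    @property
    def task(self) -> TaskType:
        # Simplifying assumption: anything exposing predict_proba is treated
        # as a classifier. Classifiers without predict_proba would be
        # misclassified by this heuristic.
        if hasattr(self.estimator, "predict_proba"):
            return "classification"
        return "regression"

    def fit(self, X: Any, y: Any, **fit_params: Any) -> "EstimatorWrapper":
        # Forward extra keyword arguments untouched and return self so
        # calls can be chained.
        self.estimator.fit(X, y, **fit_params)
        return self
```

Downstream code could then check `wrapper.task` once instead of inspecting each model separately.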

@ReinierKoops
Contributor Author

ReinierKoops commented Apr 8, 2025

Started making separate plotting classes. This will allow the plots of the main classes (rfe, resemblance, interpretability) to be standardized in style and output, and will also extend the range of plots available from each of those classes.


3 participants